35 research outputs found

    Controlling Styles in Neural Machine Translation with Activation Prompt

    Full text link
    Controlling styles in neural machine translation (NMT) has attracted wide attention, as it is crucial for enhancing the user experience. Earlier studies on this topic typically concentrate on regulating the level of formality and have achieved some progress in this area. However, two major challenges remain. The first is the difficulty of style evaluation: style comprises various aspects, such as lexis and syntax, that provide abundant information, yet only formality has been thoroughly investigated. The second is an excessive dependence on incremental adjustments, particularly when new styles are needed. To address both challenges, this paper presents a new benchmark and approach. A multiway stylized machine translation (MSMT) benchmark is introduced, incorporating diverse categories of styles across four linguistic domains. We then propose style activation prompt (StyleAP), a method that retrieves prompts from a stylized monolingual corpus and requires no extra fine-tuning. Experiments show that StyleAP effectively controls the style of the translation and achieves remarkable performance.
    Comment: Accepted by Findings of ACL 2023; the code is available at https://github.com/IvanWang0730/StyleA
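    For intuition, the following is a minimal sketch of the retrieve-then-prompt idea described above: pick the most stylistically relevant sentence from a stylized monolingual corpus and prepend it to the input, with no fine-tuning. The unigram-overlap retriever, the toy corpus, and the separator token are illustrative assumptions, not the paper's actual implementation.

```python
# A minimal sketch of retrieval-based style prompting in the spirit of StyleAP.
# The similarity measure, corpus, and prompt format are illustrative
# assumptions, not the paper's exact method.
from collections import Counter

def similarity(a: str, b: str) -> float:
    """Unigram-overlap similarity; the paper may use a stronger retriever."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    overlap = sum((ca & cb).values())
    return overlap / max(1, min(sum(ca.values()), sum(cb.values())))

def retrieve_style_prompt(source: str, stylized_corpus: list[str]) -> str:
    """Pick the most similar sentence from a stylized monolingual corpus."""
    return max(stylized_corpus, key=lambda s: similarity(source, s))

def build_prompted_input(source: str, stylized_corpus: list[str]) -> str:
    """Prepend the retrieved exemplar so the NMT model imitates its style."""
    prompt = retrieve_style_prompt(source, stylized_corpus)
    return f"{prompt} </s> {source}"  # separator token is an assumption

corpus = ["Thou art most welcome here.", "Pray, speak thy mind."]
print(build_prompted_input("You are welcome to speak.", corpus))
```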

    Only 5% Attention Is All You Need: Efficient Long-range Document-level Neural Machine Translation

    Full text link
    Document-level Neural Machine Translation (DocNMT) has been proven crucial for handling discourse phenomena by introducing document-level context information. One of the most important directions is to input the whole document directly into the standard Transformer model. In this case, efficiency becomes a critical concern due to the quadratic complexity of the attention module. Existing studies either focus on the encoder part, which cannot be deployed for sequence-to-sequence generation tasks such as Machine Translation (MT), or suffer from a significant performance drop. In this work, we maintain translation performance while gaining a 20% speed-up by introducing an extra selection layer based on lightweight attention that selects a small portion of tokens to be attended to. It takes advantage of the original attention to ensure performance and of dimension reduction to accelerate inference. Experimental results show that our method achieves approximately 95% sparsity (only 5% of tokens attended) and saves 93% of the computation cost of the attention module compared with the original Transformer, while maintaining performance.
    Comment: Accepted by AACL 202
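    A rough sketch of the token-selection idea: a lightweight score ranks the key tokens, only the top ~5% are kept, and full attention runs over that subset. The scoring function and hyper-parameters below are assumptions for illustration, not the paper's architecture.

```python
# A minimal NumPy sketch of a selection layer that keeps only the top ~5% of
# tokens for full attention. The cheap per-token score and the keep ratio are
# illustrative assumptions.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def selective_attention(q, k, v, keep_ratio=0.05):
    """q: (Lq, d), k/v: (Lk, d). Attend only to the highest-scoring keys."""
    d = q.shape[-1]
    key_scores = k.mean(axis=-1)                 # cheap global score per key
    n_keep = max(1, int(keep_ratio * k.shape[0]))
    keep = np.argsort(key_scores)[-n_keep:]      # indices of kept tokens
    attn = softmax(q @ k[keep].T / np.sqrt(d))   # (Lq, n_keep)
    return attn @ v[keep]                        # (Lq, d)

q = np.random.randn(8, 64); k = np.random.randn(1000, 64); v = np.random.randn(1000, 64)
out = selective_attention(q, k, v)  # attends to ~50 of 1000 key tokens
print(out.shape)                    # (8, 64)
```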

    Beyond Triplet: Leveraging the Most Data for Multimodal Machine Translation

    Full text link
    Multimodal machine translation (MMT) aims to improve translation quality by incorporating information from other modalities, such as vision. Previous MMT systems mainly focus on better access to and use of visual information, and tend to validate their methods on image-related datasets. These studies face two challenges: first, they can only utilize triple data (bilingual texts paired with images), which is scarce; second, current benchmarks are relatively restricted and do not correspond to realistic scenarios. Therefore, this paper establishes new methods and new datasets for MMT. First, we propose 2/3-Triplet, a framework with two new approaches that enhance MMT by utilizing large-scale non-triple data: monolingual image-text data and parallel text-only data. Second, we construct an English-Chinese e-commercial multimodal translation dataset (including training and testing), named EMMT, whose test set is carefully selected so that some words are ambiguous and would be mistranslated without the help of images. Experiments show that our method is more suitable for real-world scenarios and can significantly improve translation performance by using more non-triple data. In addition, our model also rivals various SOTA models on conventional multimodal translation benchmarks.
    Comment: 8 pages, ACL 2023 Findings

    Zero-shot Domain Adaptation for Neural Machine Translation with Retrieved Phrase-level Prompts

    Full text link
    Domain adaptation is an important challenge for neural machine translation. However, the traditional fine-tuning solution requires multiple rounds of extra training and incurs a high cost. In this paper, we propose a non-tuning paradigm that resolves domain adaptation with a prompt-based method. Specifically, we construct a bilingual phrase-level database and retrieve relevant pairs from it as a prompt for the input sentence. By utilizing Retrieved Phrase-level Prompts (RePP), we effectively boost translation quality. Experiments show that our method improves domain-specific machine translation by 6.2 BLEU and improves translation-constraint accuracy by 11.5%, without additional training.
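    A minimal sketch of phrase-level prompt retrieval in this spirit: match the entries of a bilingual phrase database against the input sentence and prepend the retrieved pairs as a prompt. The database contents, matching rule, and prompt format below are illustrative assumptions, not the paper's implementation.

```python
# A hedged sketch of phrase-level prompt retrieval in the spirit of RePP.
# Substring matching and the "src => tgt" prompt format are assumptions.
def retrieve_phrase_prompts(source: str, phrase_db: dict[str, str]) -> list[tuple[str, str]]:
    """Return bilingual phrase pairs whose source side occurs in the input."""
    return [(src, tgt) for src, tgt in phrase_db.items() if src in source]

def build_prompted_input(source: str, phrase_db: dict[str, str]) -> str:
    pairs = retrieve_phrase_prompts(source, phrase_db)
    # Prepend retrieved pairs so the model copies domain terminology.
    prompt = " ; ".join(f"{s} => {t}" for s, t in pairs)
    return f"{prompt} </s> {source}" if prompt else source

db = {"neural machine translation": "traduction automatique neuronale"}
print(build_prompted_input("we study neural machine translation", db))
```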

    BigVideo: A Large-scale Video Subtitle Translation Dataset for Multimodal Machine Translation

    Full text link
    We present a large-scale video subtitle translation dataset, BigVideo, to facilitate the study of multimodal machine translation. Compared with the widely used How2 and VaTeX datasets, BigVideo is more than 10 times larger, consisting of 4.5 million sentence pairs and 9,981 hours of video. We also introduce two deliberately designed test sets to verify the necessity of visual information: Ambiguous, with the presence of ambiguous words, and Unambiguous, in which the text context is self-contained for translation. To better model the common semantics shared across texts and videos, we introduce a contrastive learning method in the cross-modal encoder. Extensive experiments on BigVideo show that: a) visual information consistently improves the NMT model in terms of BLEU, BLEURT, and COMET on both the Ambiguous and Unambiguous test sets; b) visual information helps disambiguation, compared to a strong text baseline, on terminology-targeted scores and in human evaluation. The dataset and our implementations are available at https://github.com/DeepLearnXMU/BigVideo-VMT.
    Comment: Accepted to ACL 2023 Findings
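    To make the contrastive objective concrete, here is a sketch of a symmetric InfoNCE-style loss that pulls matched text and video embeddings together and pushes mismatched pairs apart. The temperature and the exact loss form used in BigVideo's cross-modal encoder are assumptions here.

```python
# A minimal NumPy sketch of a symmetric InfoNCE-style contrastive loss over
# matched (text, video) embedding pairs; hyper-parameters are assumptions.
import numpy as np

def info_nce(text_emb, video_emb, temperature=0.07):
    """text_emb, video_emb: (B, d); row i of each is a matched pair."""
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    v = video_emb / np.linalg.norm(video_emb, axis=1, keepdims=True)
    logits = t @ v.T / temperature                 # (B, B) similarity matrix
    labels = np.arange(len(t))                     # positives on the diagonal
    lp_t2v = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    lp_v2t = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    loss_t2v = -lp_t2v[labels, labels].mean()      # text -> video direction
    loss_v2t = -lp_v2t[labels, labels].mean()      # video -> text direction
    return (loss_t2v + loss_v2t) / 2

text = np.random.randn(4, 128); video = np.random.randn(4, 128)
print(info_nce(text, video))
```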

    Role of drugs used for chronic disease management on susceptibility and severity of COVID-19: A large case-control study

    Get PDF
    The study aimed to investigate whether specific medications used in the treatment of chronic diseases affected the development and/or severity of COVID-19, in a cohort of 610 COVID-19 cases and 48,667 population-based controls from Zhejiang, China. Using 578 COVID-19 cases and the 48,667 population-based controls, we tested the role of cardiovascular, antidiabetic, and other medications on the risk and severity of COVID-19. Analyses were adjusted for age, sex, and BMI, and for the presence of relevant comorbidities. Individuals with hypertension taking calcium channel blockers had a significantly increased risk [odds ratio (OR) = 1.73 (95% CI 1.2-2.3)] of manifesting symptoms of COVID-19, whereas those taking angiotensin receptor blockers or diuretics had a significantly lower disease risk (OR = 0.22, 95% CI 0.15-0.30 and OR = 0.30, 95% CI 0.19-0.58, respectively). Among those with type 2 diabetes, dipeptidyl peptidase-4 inhibitors (OR = 6.02, 95% CI 2.3-15.5) and insulin (OR = 2.71, 95% CI 1.6-5.5) were more prevalent, and glucosidase inhibitors were less prevalent (OR = 0.11, 95% CI 0.1-0.3), among COVID-19 patients. Drugs used in the treatment of hypertension and diabetes influence the risk of developing COVID-19 but not its severity.
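    For readers unfamiliar with the reported statistics, the snippet below shows the standard unadjusted odds-ratio calculation with a Woolf 95% confidence interval; the study itself used models adjusted for age, sex, BMI, and comorbidities, and the counts in the example are made up.

```python
# Standard unadjusted odds ratio with a Woolf 95% CI, shown only to illustrate
# how case-control ORs like those above are derived; the example counts are
# hypothetical, not data from the study.
import math

def odds_ratio_ci(exposed_cases, unexposed_cases, exposed_ctrls, unexposed_ctrls):
    or_ = (exposed_cases * unexposed_ctrls) / (unexposed_cases * exposed_ctrls)
    se = math.sqrt(1/exposed_cases + 1/unexposed_cases
                   + 1/exposed_ctrls + 1/unexposed_ctrls)
    lo = math.exp(math.log(or_) - 1.96 * se)
    hi = math.exp(math.log(or_) + 1.96 * se)
    return or_, (lo, hi)

print(odds_ratio_ci(40, 60, 2000, 5000))  # hypothetical 2x2 table
```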

    Recent advances of PROTACs technology in neurodegenerative diseases

    No full text
    Neurodegenerative diseases such as Alzheimer's disease, Huntington's disease, Parkinson's disease, progressive supranuclear palsy, and frontotemporal dementia are among the refractory diseases that lack appropriate drugs and treatments. Numerous disease-causing proteins in neurodegenerative diseases are undruggable with traditional drugs. Many clinical studies of drugs for Alzheimer's disease have failed, and none of the substances that slow the amyloid-β (Aβ) accumulation process has been approved for clinical use. A novel approach to addressing this issue is proteolysis-targeting chimera (PROTAC) technology. PROTACs are heterobifunctional molecules joined by a chemical linker, comprising a binding ligand for the target protein and a recruitment ligand for an E3 ligase. When a PROTAC binds to a target protein, the E3 ligase is brought into close proximity, the target protein is polyubiquitinated, and proteasome-mediated degradation follows. Numerous neurodegenerative disease-related targets, including α-Synuclein, mHTT, GSK-3, LRRK2, Tau, TRKA, and TRKC, have been successfully targeted by PROTACs to date. This article presents a comprehensive overview of the development of PROTACs in neurodegenerative diseases. The chemical structures, preparative routes, in vitro and in vivo activities, and pharmacodynamics of these PROTACs are outlined. We also offer our viewpoint on the probable challenges and future prospects of PROTACs.

    Parity flipping mediated by a quantum dot in Majorana Josephson junctions

    Full text link
    With the increasing number of experimentally measurable signatures of Majorana bound states (MBSs), how to manipulate qubits with MBSs has become a crucial issue. In this work, we introduce a quantum dot (QD) into Majorana Josephson junctions (MJJs). The parity characteristics of Majorana qubits can be manipulated by modulating the QD energy. We study the voltage induced by parity flipping using fast or adiabatic evolution approaches, and find an interesting 4π-period hopping behavior of the electron occupying the QD in the robust 4π-period voltage state, which can be applied to fabricate hybrid quantum circuits. We further investigate Landau-Zener-Stückelberg interference under distinct driving frequencies: it shows a cosinusoidal QD occupation probability without parity flipping and a controllable voltage state with parity flipping. Furthermore, we find that the Rabi oscillation in our system is suppressed by damping. These properties can help detect MBSs and realize quantum computation.
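    As background to the 4π-period voltage state, the following LaTeX note states the standard fractional Josephson current-phase relation for a parity-conserving Majorana junction; it is textbook material, not the paper's full quantum-dot model.

```latex
% Standard background, not the paper's full quantum-dot model: the fractional
% Josephson relation behind the 4\pi-periodic voltage state.
\documentclass{article}
\usepackage{amsmath}
\begin{document}
A conventional junction has a $2\pi$-periodic current--phase relation,
$I(\varphi) = I_c \sin\varphi$. A Majorana junction at fixed fermion parity
$p = \pm 1$ instead carries
\begin{equation}
  I(\varphi) = p\, I_M \sin\frac{\varphi}{2},
\end{equation}
which is $4\pi$-periodic in $\varphi$; a parity flip $p \to -p$ reverses the
supercurrent, which is the switching behavior the quantum dot mediates here.
\end{document}
```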